AITopics | time horizon

Despite the fact that experimental neural scaling laws have substantially guided empirical progress in large-scale machine learning, no existing theory can quantitatively predict the exponents of these important laws for any modern LLM trained on any natural language dataset. We provide the first such theory in the case of data-limited scaling laws. We isolate two key statistical properties of language that alone can predict neural scaling exponents: (i) the decay of pairwise token correlations with time separation between token pairs, and (ii) the decay of the next-token conditional entropy with the length of the conditioning context. We further derive a simple formula in terms of these statistics that predicts data-limited neural scaling exponents from first principles without any free parameters or synthetic data models. Our theory exhibits a remarkable match with experimentally measured neural scaling laws obtained from training GPT-2 and LLaMA style models from scratch on two qualitatively different benchmarks, TinyStories and WikiText.
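Both statistics named in the abstract are described as decaying quantities, and a decay exponent is commonly read off a straight-line fit in log-log space. The following is a minimal sketch of that generic step, assuming a power-law form; the abstract does not state the functional form or the paper's formula, so this is not the authors' method. The function name `fit_power_law_exponent` and the exact power-law input are hypothetical illustrations, not measured data.

```python
import numpy as np

def fit_power_law_exponent(x, y):
    """Estimate (a, c) in y ~ c * x**(-a) by least squares in log-log space."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0) or np.any(y <= 0):
        raise ValueError("x and y must be strictly positive for a log-log fit")
    # log y = log c - a * log x, so the fitted slope is -a.
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return -slope, float(np.exp(intercept))

# Illustrative input only (assumption): an exact power law with exponent 0.5.
separations = np.arange(1, 65, dtype=float)
statistic = 2.0 * separations ** -0.5

exponent, prefactor = fit_power_law_exponent(separations, statistic)
print(f"exponent ~ {exponent:.3f}, prefactor ~ {prefactor:.3f}")
```

On real measurements the fit range matters: short separations and noisy long-range tails can both bias the slope, so the window would need to be chosen and reported explicitly.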

large language model, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2602.07488

Country:

North America > United States > California > Santa Clara County > Stanford (0.14)
Europe > Switzerland > Vaud > Lausanne (0.04)
North America > United States > Maryland > Baltimore (0.04)
(4 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)


e995f98d56967d946471af29d7bf99f1-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 17:18:37 GMT

agent, dynamic environment, mechanism, (17 more...)

Neural Information Processing Systems

Country: South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.42)


Automated Dynamic Mechanism Design

Neural Information Processing Systems | Feb-11-2026, 17:18:33 GMT

For environments with large time horizons, we show that the principal's optimal

artificial intelligence, machine learning, mechanism, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.44)


4c454d34f3a4c8d6b4ca85a918e5d7ba-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 12:57:25 GMT

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


b7da6669894867f04b8727876a69ffc0-Paper.pdf

Neural Information Processing Systems | Feb-10-2026, 21:01:50 GMT

algorithm, fairness, sequence, (17 more...)

Neural Information Processing Systems

Country:

South America > Chile (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


670c26185a3783678135b4697f7dbd1a-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 17:25:26 GMT

Our goal is to design algorithms that can automatically adapt to the unknown hardness of the problem, i.e., the number of best arms.

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Data Science > Data Mining > Big Data (0.46)


This is the most misunderstood graph in AI

MIT Technology Review | Feb-5-2026, 10:00:00 GMT

To some, METR's "time horizon plot" indicates that AI utopia--or apocalypse--is close at hand. The truth is more complicated. Every time OpenAI, Google, or Anthropic drops a new frontier large language model, the AI community holds its breath. It doesn't exhale until METR, an AI research nonprofit whose name stands for "Model Evaluation & Threat Research," updates a now-iconic graph that has played a major role in the AI discourse since it was first released in March of last year. The graph suggests that certain AI capabilities are developing at an exponential rate, and more recent model releases have outperformed that already impressive trend. That was certainly the case for Claude Opus 4.5, the latest version of Anthropic's most powerful model, which was released in late November.

artificial intelligence, large language model, natural language, (14 more...)

MIT Technology Review

Country: North America > United States > Illinois (0.14)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)


Distributed Online Convex Optimization with Compressed Communication

Neural Information Processing Systems | Dec-25-2025, 12:08:00 GMT

We consider a distributed online convex optimization problem in which streaming data are distributed among computing agents over a connected communication network. Since the data are high-dimensional or the network is large-scale, communication load can be a bottleneck for the efficiency of distributed algorithms. To tackle this bottleneck, we apply a state-of-the-art data compression scheme to the fundamental GD-based distributed online algorithms. Three algorithms with difference-compressed communication are proposed for full information feedback (DC-DOGD), one-point bandit feedback (DC-DOBD), and two-point bandit feedback (DC-DO2BD), respectively. We obtain regret bounds explicitly in terms of time horizon, compression ratio, decision dimension, agent number, and network parameters. Our algorithms are proved to be no-regret and match the same regret bounds, w.r.t.

algorithm, name change, online convex optimization, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Communications > Networks (0.98)
Information Technology > Artificial Intelligence > Machine Learning (0.64)



e5b294b70c9647dcf804d7baa1903918-AuthorFeedback.pdf

Nonparametric Contextual Bandits in Metric Spaces with Unknown Metric

Deriving Neural Scaling Laws from the statistics of natural language
